Fix: count all tool tokens in budget and add summarization UserMessage priority #4992
Merged
…ools Deferred tools (defer_loading: true) still count against the API context window. The 3/30 change (#4834) excluded them from toolTokens, causing the message budget to be ~31K tokens too generous and leading to context_length_exceeded errors followed by summarization failures ("No messages provided").
f965703 to
2d70e5a
Pull request overview
This PR addresses agent summarization failures by tightening token-budget calculations to account for tool schemas in summarization contexts and by preventing the summarization instruction UserMessage from being pruned under budget pressure.
Changes:
- Count all tool tokens in the agent prompt budgeting path to avoid overestimating available message budget when summarization runs with fully-counted tools.
- Reserve tool-token budget when rendering the Full-mode summarization prompt, and give the summarization instruction `UserMessage` an explicit priority.
- Add a unit test capturing a “No messages provided” repro scenario under extremely small token budgets.
| File | Description |
|---|---|
| src/extension/prompts/node/agent/test/summarization.spec.tsx | Adds a repro-oriented test around empty rendered message arrays under tiny budgets. |
| src/extension/prompts/node/agent/summarizedConversationHistory.tsx | Ensures summarization instruction UserMessage has priority; reserves prompt budget for tool schemas in Full summarization mode. |
| src/extension/intents/node/agentIntent.ts | Changes tool token counting to include all tools (including deferred) and removes tool filtering for summarization prompt context. |
Copilot's findings
- Files reviewed: 6/6 changed files
- Comments generated: 3
vijayupadya approved these changes Apr 6, 2026
DonJayamanne added a commit that referenced this pull request Apr 6, 2026
Related: https://github.com/microsoft/vscode-internalbacklog/issues/7331
Problem
Two issues causing summarization failures:
1. Tool token under-counting (context_length_exceeded → "No messages provided")
The 3/30 change (#4834) excluded deferred tools from `toolTokens`, making the message budget ~31K tokens too generous. Deferred tools use `defer_loading: true` in the main agent loop (so Anthropic does not count them), but the summarization call uses `ChatLocation.Other`, where all tools count fully. The overly generous budget lets conversations grow larger than the actual context window can hold, leading to cascading summarization failures.

Fix: Count all tool tokens (including deferred) in the budget calculation. This is conservative for the main render but safe, and it ensures summarization is triggered earlier, before the context window overflows.
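The budget arithmetic behind this fix can be sketched as follows. This is a minimal illustration, not the code in `agentIntent.ts`: the `ToolSchema` shape, the `messageBudget` helper, the 200,000-token window, and the 9,000-token core tool are all assumptions made for the example. Only the ~31K deferred-tool figure comes from the description above.

```typescript
// Hypothetical model of the budget calculation; names and numbers are illustrative.
interface ToolSchema {
  name: string;
  tokens: number;        // estimated token cost of the tool's schema
  deferLoading: boolean; // mirrors defer_loading: true
}

// Message budget = context window minus the tool tokens that are counted.
function messageBudget(
  contextWindow: number,
  tools: ToolSchema[],
  countDeferred: boolean,
): number {
  const toolTokens = tools
    .filter(t => countDeferred || !t.deferLoading)
    .reduce((sum, t) => sum + t.tokens, 0);
  return Math.max(0, contextWindow - toolTokens);
}

const tools: ToolSchema[] = [
  { name: 'core_tool', tokens: 9000, deferLoading: false },      // assumed value
  { name: 'deferred_tools', tokens: 31000, deferLoading: true }, // the ~31K gap
];
const contextWindow = 200000; // assumed value

// Behaviour after #4834: deferred tools are excluded from toolTokens.
const budgetExcludingDeferred = messageBudget(contextWindow, tools, false);
// Behaviour after this fix: every tool is counted.
const budgetCountingAll = messageBudget(contextWindow, tools, true);
// How far the message budget was overstated when all tools actually count.
const overshoot = budgetExcludingDeferred - budgetCountingAll;
```

Under these assumed numbers the two budgets differ by exactly the deferred tools' token cost, which is the amount by which a conversation could outgrow the window before summarization is triggered.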
Also applies the same tool-token-aware budget reduction inside `getSummary(Full)` so the summarization prompt render leaves room for tool schemas.

2. Summarization UserMessage has no priority (empty messages)
The final "Summarize the conversation..." `UserMessage` in `ConversationHistorySummarizationPrompt` had no explicit priority (defaulting to 0). Under tight budgets, prompt-tsx would keep the high-priority SystemMessage (900) and prune the UserMessage, leaving only system messages → Anthropic extracts those to the `system` field → the `messages` array is empty → "No messages provided".

Fix: Set `priority={this.props.priority}` on the UserMessage to match the SystemMessage priority.

Changes
- `agentIntent.ts`: Count all tools for `toolTokens` (remove `effectiveTools` filtering)
- `summarizedConversationHistory.tsx`: Add tool-token budget reduction in `getSummary(Full)`, add priority to summarization UserMessage
- `summarization.spec.tsx`: Add repro test for empty messages with small budget
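
The empty-messages failure from issue 2 can be reproduced with a toy model of priority-based pruning. This sketch does not use the prompt-tsx API. `Piece`, `prune`, `toRequest`, and `buildPrompt` are hypothetical names, and the token costs and the history priority of 500 are assumptions. Only the SystemMessage priority (900) and the default priority (0) come from the description above.

```typescript
// Toy model only: this is NOT the prompt-tsx API.
type Role = 'system' | 'user' | 'assistant';

interface Piece {
  role: Role;
  text: string;
  tokens: number;   // assumed token cost
  priority: number; // lower values are pruned first
}

// Drop the lowest-priority pieces until the remainder fits the budget.
function prune(pieces: Piece[], budget: number): Piece[] {
  const kept = [...pieces];
  let total = kept.reduce((sum, p) => sum + p.tokens, 0);
  while (total > budget && kept.length > 0) {
    let lowest = 0;
    for (let i = 1; i < kept.length; i++) {
      if (kept[i].priority < kept[lowest].priority) lowest = i;
    }
    total -= kept[lowest].tokens;
    kept.splice(lowest, 1);
  }
  return kept;
}

// Mimics the described request shape: system messages move to a separate field.
function toRequest(pieces: Piece[]): { system: string[]; messages: Piece[] } {
  return {
    system: pieces.filter(p => p.role === 'system').map(p => p.text),
    messages: pieces.filter(p => p.role !== 'system'),
  };
}

function buildPrompt(instructionPriority: number): Piece[] {
  return [
    { role: 'system', text: 'You summarize conversations.', tokens: 40, priority: 900 },
    { role: 'assistant', text: '(long history)', tokens: 500, priority: 500 }, // assumed priority
    { role: 'user', text: 'Summarize the conversation...', tokens: 20, priority: instructionPriority },
  ];
}

// Fits the system message plus the instruction (60 tokens) but not the history.
const budget = 100;

// Before the fix: the instruction defaults to priority 0 and is pruned first.
const broken = toRequest(prune(buildPrompt(0), budget));
// After the fix: the instruction matches the system message priority and survives.
const fixed = toRequest(prune(buildPrompt(900), budget));
```

In this model `broken.messages` ends up empty while `fixed.messages` still holds the summarization instruction, which matches the difference between the "No messages provided" failure and the intended behaviour.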